Method of filtering a bitstream according to user specifications

ABSTRACT

The invention proposes a method of filtering a bitstream according to user specifications. The proposed method uses a semantic description and a syntactical description of the bitstream. The semantic description is scanned to select the elementary units that match the user specification. The time is used as a linking mechanism between the semantic description and the syntactical description to locate the elements of the syntactical description that are to be removed. A filtered syntactical description is generated by removing the located elements. Finally, a filtered bitstream is generated from the filtered syntactical description.

FIELD OF THE INVENTION

The invention relates to a method of filtering a bitstream using a syntactical description of said bitstream and at least a user specification.

The invention also relates to a device comprising means for implementing such a filtering method.

The invention also relates to a system comprising a server device, a transmission channel and a user device wherein said server and/or said user devices comprise means for implementing such a filtering method.

The invention also relates to a program comprising instructions for implementing such a filtering method when said program is executed by a processor.

The invention also relates to a filtered bitstream obtained by applying such a filtering method.

The invention allows filtering out undesired scenes in a video, for example in a video streamed via the internet or transmitted via cable network or any other type of network. It may be used to implement a parental control, for example, for skipping scenes having a violent or sexual connotation.

BACKGROUND OF THE INVENTION

Such a filtering method is described in the ISO document “Proposal of a Generic Bitstream Syntax Description Language” by J. Heuer, A. Hutter, G. Panis, H. Hellwagner, H. Kosch and C. Timmerer (reference ISO/IEC JTC1/SC29/WG11 MPEG02/M8291 Fairfax/May 2002).

In this ISO document, it is proposed to act on a syntactical description of the bitstream rather than on the bitstream itself. A syntactical description is defined as being an XML document describing the high-level structure of the bitstream. The proposed syntactical description comprises elements that are marked with semantically meaningful data. The proposed method consists in defining transformations aimed at removing the elements that are marked with a specific marker from the syntactical description. Then a filtered bitstream is generated from the transformed syntactical description.

An advantage of such a solution is that it generates a filtered bitstream in which the prohibited passages are removed.

This solution uses specific markers and specific transformations associated with said specific markers.

The invention proposes an alternative solution that avoids being limited to predefined markers.

SUMMARY OF THE INVENTION

According to the invention, a method of filtering a bitstream comprising elementary units having a time position, and first timing data indicative of said time positions, uses:

a syntactical description of said bitstream, said syntactical description comprising elements describing said elementary units and containing said first timing data,

a semantic description of said bitstream, said semantic description comprising second timing data and characterizing data relating to one or more elementary units, said second timing data being indicative of the time positions of said elementary units,

at least a user specification,

and comprises the steps of:

searching in said semantic description for the characterizing data that match said user specification to identify matching elementary units,

deriving time positions for said matching elementary units from said second timing data,

using said first timing data to locate in said syntactical description the elements corresponding to said time positions,

generating a filtered syntactical description in which the located elements are removed,

generating a filtered bitstream from said filtered syntactical description.

Instead of adding specific markers to the syntactical description, the invention uses a separate semantic description of the bitstream. Advantageously, this semantic description is compliant with the MPEG-7 standard. The time position of the elementary units is used as a linking mechanism between the semantic description and the syntactical description: the elementary units that match the user specification are identified by searching the semantic description; then the time positions of the matching elementary units are determined; and finally the determined time positions are used to locate the corresponding elements in the syntactical description.

By doing so, the user is not limited to specific markers for defining the filtering specification. This is more convenient for the user.

All the metadata contained in the semantic description are used for filtering, which brings more flexibility.

In many applications, audio/video bitstreams are associated with an MPEG-7 description. It is advantageous to use this existing and standardized description instead of enhancing the syntactical description with specific markers.

In an advantageous embodiment, said syntactical description is an XML document (eXtensible Markup Language) and said filtered syntactical description is generated by applying to said syntactical description a parametric transformation defined in an XSL style sheet (eXtensible Stylesheet Language) having said time positions as input parameters. XML and XSL are defined by the W3C consortium.

An XSL style sheet is a text file, written in the XML mark-up language. XSL style sheets were specifically designed to transform XML documents: they contain instructions to be applied by an XSL processor to output a transformed XML document from an input XML document.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will be further described with reference to the accompanying drawings:

FIG. 1 is a block diagram describing a filtering method according to the invention.

FIG. 2 is a block diagram of a first embodiment of a system according to the invention.

FIG. 3 is a block diagram of a second embodiment of a system according to the invention.

DESCRIPTION OF A PREFERRED EMBODIMENT

A method of filtering a bitstream according to a user specification will now be described. This method uses:

a semantic description of the bitstream,

a syntactical description of the bitstream.

A semantic description of a bitstream comprises metadata relating to the content of the bitstream and giving a meaningful description of said content. MPEG-7 is a well-known standard for semantic descriptions of audio/video content. The creation of such a semantic description involves human participation. The semantic description is usually created once at the stage of the bitstream generation, and then appended to the bitstream.

An MPEG-7 semantic description may comprise elements called <CreationInformation> carrying author-generated information about the content. This information is not explicitly depicted in the content, and usually cannot be extracted from the content. The <CreationInformation> elements notably contain a sub-element called <Classification>. The object of the <Classification> element is to give descriptions allowing classification of the content. For instance, the following descriptions are proposed in MPEG-7:

<Genre>: describes one genre that applies to the content,

<Subject>: describes the subject of the content with a textual annotation,

<MarketClassification>: describes one targeted market for the content,

<AgeClassification>: describes the target age range for the content,

<ParentalGuidance>: describes one parental guidance for the content,

<MediaReview>: describes reviews of the content.

The contents of all these elements are advantageously used as characterizing data.

An MPEG-7 semantic description also comprises elements called <MediaTime> carrying timing data relating to the bitstream. These timing data are the second timing data of the invention. MPEG-7 proposes several formats for defining said second timing data. One example will be given below.

A syntactical description of a bitstream describes the structure of the bitstream. Advantageously, such a syntactical description is generated automatically from the bitstream and from a model describing the syntax of the bitstream format. Such a syntactical description can be generated once and appended to the bitstream. It can also be generated by an application when required. The ISO document “Bitstream Syntax Description Language” by Sylvain Devillers, Myriam Amielh, and Thierry Planterose (reference ISO/IEC JTC1/SC29/WG11 MPEG/M8273, Fairfax, May 2002), describes a method of generating a syntactical description of a bitstream from a model describing the syntax of the bitstream format (and reciprocally for generating a bitstream from a syntactical description of said bitstream and from the model describing the syntax of the bitstream format).

In the remainder of the description, the generation of the syntactical description of the bitstream is regarded as being a step of the filtering method. This is not restrictive. The syntactical description can also be appended to the bitstream to be filtered.

FIG. 1 is a block diagram of a method of filtering a bitstream BST according to a user specification UP. The user specification UP is a set of one or more key words. The bitstream BST comprises elementary units and first timing data from which a time position can be derived for each elementary unit.

The bitstream BST is semantically described in a semantic description SEM, and syntactically described in a syntactical description SYN.

The semantic description SEM comprises second timing data and characterizing data relating to one or more elementary units. The second timing data are indicative of the time positions of the elementary units. The syntactical description comprises elements describing the elementary units and containing the first timing data.

As indicated in FIG. 1, the filtering method of the invention comprises four steps S1, S2, S3 and S4.

At step S1, the syntactical description SYN is generated from the bitstream BST.

At step S2, the semantic description SEM is searched for characterizing data that match the user specification UP. The elementary units MEi to which the matching characterizing data relate are called matching elementary units. The second timing data D2(MEi) relating to the matching elementary units are used to derive a time position TP(i) for each matching elementary unit. Said time positions are used as input parameters at step S3.

At step S3, the syntactical description SYN is scanned to detect the elements ETj that have first timing data D1(ETj) corresponding to the time positions TP(i) derived at step S2. A filtered syntactical description FSYN is generated in which said elements are removed.

At step S4, a filtered bitstream FBST is generated from the filtered syntactical description FSYN. For example, the filtered bitstream FBST is generated as indicated in the above-described document.
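Steps S2 and S3 can be sketched on in-memory data as follows. This is only an illustration under stated assumptions: the class names Segment and Element, the function filter_description, the key word "violence" and the half-open interval test are all hypothetical and are not prescribed by the method.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One entry of the semantic description SEM (hypothetical model)."""
    start: float          # second timing data: start time, in seconds
    duration: float       # second timing data: duration, in seconds
    keywords: frozenset   # characterizing data


@dataclass
class Element:
    """One element of the syntactical description SYN (hypothetical model)."""
    time: float           # time position derived from the first timing data
    payload: str


def filter_description(sem, syn, user_spec):
    """Steps S2 and S3: return SYN without the elements of matching segments."""
    # S2: find the matching segments and derive their time intervals.
    intervals = [(s.start, s.start + s.duration)
                 for s in sem if s.keywords & user_spec]
    # S3: keep only the elements whose time position lies outside every
    # interval (assumption: intervals are treated as half-open [start, end)).
    return [e for e in syn
            if not any(lo <= e.time < hi for lo, hi in intervals)]


sem = [Segment(0, 2, frozenset()), Segment(2, 1, frozenset({"violence"}))]
syn = [Element(t * 0.5, f"unit{t}") for t in range(8)]  # one unit every 0.5 s
fsyn = filter_description(sem, syn, {"violence"})
print([e.payload for e in fsyn])
```

Step S4, the regeneration of the bitstream from the filtered description, depends on the bitstream format and is left out of this sketch.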

An example will now be given for illustrative purposes. In this example, the bitstream is compliant with the MPEG-4 standard. This is not restrictive. The invention is applicable to other encoding formats.

The elementary units of an MPEG-4 video are called Video Object Planes (VOPs). A syntactical description of the illustrative bitstream is given below:

<Bitstream xml:base="http://www.mpeg7.org/the_video.mpg"
           xmlns="MPEG4"
           xmlns:mp4="MPEG4"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.example.org/MPEG4 Schemas/MPEG4.xsd">
  <VO>
    ...
  </VO>
  <VOL>
    ...
    <VOP_time_increment_resolution>40</VOP_time_increment_resolution>
    ...
    <fixed_VOP_rate>1</fixed_VOP_rate>
    <fixed_VOP_time_increment>1</fixed_VOP_time_increment>
    ...
  </VOL>
  <VOP>
    <StartCode>000001B6</StartCode>
    <Type>0</Type>
    <Stuffing>16</Stuffing>
    <Payload>100-4658</Payload>
  </VOP>
  <VOP>
    <StartCode>000001B6</StartCode>
    <Type>1</Type>
    <Stuffing>17</Stuffing>
    <Payload>4664-4756</Payload>
  </VOP>
  ...
</Bitstream>

This illustrative syntactical description contains <VOP> elements representing elementary units, and first timing data. The first timing data are contained in:

the <VOP_time_increment_resolution> element,

the <fixed_VOP_rate> element,

the <fixed_VOP_time_increment> element.

The <VOP_time_increment_resolution> indicates the number of ticks within one second. Thus, in this example, one second is divided into 40 ticks.

The <fixed_VOP_rate> is a one-bit flag which indicates whether all VOPs are coded with a fixed VOP rate. When it is equal to “1”, the distance between the display times of any two successive VOPs in the display order is constant.

The <fixed_VOP_time_increment> indicates the number of ticks between two successive VOPs in the display order. In this example, one VOP is displayed every 25 ms (1/40 s).
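The relation between these three elements can be checked with a short calculation. The sketch below is illustrative only: the function name vop_period_seconds is an assumption, and only the fixed-rate case of the example is handled.

```python
from fractions import Fraction


def vop_period_seconds(ticks_per_second: int, fixed_vop_rate: int,
                       ticks_per_vop: int) -> Fraction:
    """Return the display period of one VOP, in seconds.

    ticks_per_second: value of <VOP_time_increment_resolution>
    fixed_vop_rate:   value of <fixed_VOP_rate> (must be 1 in this sketch)
    ticks_per_vop:    value of <fixed_VOP_time_increment>
    """
    if fixed_vop_rate != 1:
        raise ValueError("variable VOP rate is outside the scope of this sketch")
    return Fraction(ticks_per_vop, ticks_per_second)


period = vop_period_seconds(40, 1, 1)
print(period, "second per VOP")  # 1/40 second per VOP, i.e. 25 ms
```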

A semantic description of the illustrative bitstream will now be given below. This semantic description is compliant with the MPEG-7 standard:

<Mpeg7>
  <Description xsi:type="ContentEntityType">
    <MultimediaContent xsi:type="VideoType">
      <Video>
        <MediaLocator>
          <MediaUri>http://www.mpeg7.org/the_video.mpg</MediaUri>
        </MediaLocator>
        <CreationInformation>
          <Creation>
            <Title>Basic Instinct</Title>
          </Creation>
        </CreationInformation>
        <!-- video segment S1 -->
        <VideoSegment>
          <MediaTime>
            <MediaRelTimePoint mediaTimeBase="//MediaLocator[1]">PT0S</MediaRelTimePoint>
            <MediaDuration>PT15M20S</MediaDuration>
          </MediaTime>
        </VideoSegment>
        <!-- video segment S2 -->
        <VideoSegment>
          <CreationInformation>
            <Classification>
              <ParentalGuidance>
                <MinimumAge>18</MinimumAge>
              </ParentalGuidance>
            </Classification>
          </CreationInformation>
          <MediaTime>
            <MediaRelTimePoint mediaTimeBase="//MediaLocator[1]">PT15M20S</MediaRelTimePoint>
            <MediaDuration>PT1M30S</MediaDuration>
          </MediaTime>
        </VideoSegment>
        ...
      </Video>
    </MultimediaContent>
  </Description>
</Mpeg7>

This illustrative semantic description comprises two video segments S1 and S2, each of them corresponding to a plurality of VOPs. Each video segment comprises second timing data contained in:

the <MediaRelTimePoint> element,

the <MediaDuration> element.

The <MediaRelTimePoint> element indicates the start time of the video segment by reference to a time base. The time base is the starting time of the video. The first video segment S1 starts at time PT0S (0 second). The second video segment S2 starts at time PT15M20S (15 minutes 20 seconds, or 920 seconds).

The <MediaDuration> element indicates the duration of the video segment. The duration of the first video segment S1 is equal to PT15M20S. The duration of the second video segment S2 is PT1M30S (1 minute 30 seconds, or 90 seconds).

The second video segment S2 contains characterizing data in the <MinimumAge> element. According to these characterizing data, the minimum recommended age for watching this second video segment S2 is 18.
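The scan of such a semantic description can be sketched with the Python standard library. The fragment below is a simplified stand-in for the description above (namespaces and attributes are omitted), and the names to_seconds and matching_intervals are hypothetical. Only time values of the form PT..H..M..S are handled, which is an assumption of this sketch and not a limit of MPEG-7.

```python
import re
import xml.etree.ElementTree as ET

# Simplified fragment: two video segments, the second one rated 18.
SEMANTIC = """
<Video>
  <VideoSegment>
    <MediaTime>
      <MediaRelTimePoint>PT0S</MediaRelTimePoint>
      <MediaDuration>PT15M20S</MediaDuration>
    </MediaTime>
  </VideoSegment>
  <VideoSegment>
    <CreationInformation>
      <Classification>
        <ParentalGuidance>
          <MinimumAge>18</MinimumAge>
        </ParentalGuidance>
      </Classification>
    </CreationInformation>
    <MediaTime>
      <MediaRelTimePoint>PT15M20S</MediaRelTimePoint>
      <MediaDuration>PT1M30S</MediaDuration>
    </MediaTime>
  </VideoSegment>
</Video>
"""

_TIME = re.compile(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")


def to_seconds(text: str) -> int:
    """Convert a value such as 'PT15M20S' to a number of seconds."""
    m = _TIME.match(text.strip())
    if m is None:
        raise ValueError(f"unsupported time value: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in m.groups())
    return 3600 * hours + 60 * minutes + seconds


def matching_intervals(xml_text: str, min_age: int):
    """Return (start, end) in seconds for segments with MinimumAge >= min_age."""
    intervals = []
    for seg in ET.fromstring(xml_text).iter("VideoSegment"):
        age = seg.findtext(".//MinimumAge")
        if age is None or int(age) < min_age:
            continue  # no rating, or rating below the threshold
        start = to_seconds(seg.findtext("MediaTime/MediaRelTimePoint"))
        duration = to_seconds(seg.findtext("MediaTime/MediaDuration"))
        intervals.append((start, start + duration))
    return intervals


print(matching_intervals(SEMANTIC, 18))  # [(920, 1010)]
```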

Let us assume that the user specifies that the scenes not recommended under 18 must be deleted. First the semantic description is scanned. For each video segment, if the minimum age is higher than or equal to 18, the time position of the video segment is derived from the second timing data. In the illustrative example, all VOPs contained in the second video segment S2 are matching elementary units. Their time positions correspond to the time interval [920-1010] seconds, which is derived from the second timing data contained in the semantic description (start time of 920 s plus duration of 90 s). Then the first timing data contained in the syntactical description are used to identify the VOPs to be deleted. As mentioned above, in this example, the first timing data indicate that one VOP is displayed every 25 ms. Therefore, the time positions [920-1010] correspond to VOP number 36800 to VOP number 40400.
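The conversion from the time interval to VOP numbers is a plain division by the VOP period. A minimal sketch, assuming a fixed VOP rate and a hypothetical helper named vop_number_range:

```python
from fractions import Fraction


def vop_number_range(start_s: int, end_s: int, vop_period: Fraction):
    """Map a time interval in seconds to (first, last) VOP numbers.

    Fractions are used so that 920 / (1/40) is computed exactly; int()
    truncates if the interval bounds do not fall on a VOP boundary.
    """
    first = Fraction(start_s) / vop_period
    last = Fraction(end_s) / vop_period
    return int(first), int(last)


print(vop_number_range(920, 1010, Fraction(1, 40)))  # (36800, 40400)
```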

Now an example of a parametric XSL style sheet that may be applied to remove the matching VOPs will be described. The following style sheet defines two parameters, firstVOPNumber and lastVOPNumber. It is applied to remove all the VOPs whose number lies between the values firstVOPNumber and lastVOPNumber. In the above-described example, the values of the two parameters are:

firstVOPNumber=920/0.025=36 800

lastVOPNumber=1010/0.025=40 400

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:m="MPEG4"
                version="1.0">

  <!-- Parameters -->
  <xsl:param name="firstVOPNumber">0</xsl:param>
  <xsl:param name="lastVOPNumber">0</xsl:param>

  <!-- Match all: default template -->
  <xsl:template name="tplAll" match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Match root element -->
  <xsl:template match="m:Bitstream">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Match firstVOPNumber VOP to lastVOPNumber VOP -->
  <!-- XSLT 1.0 does not allow parameter references in a match pattern,
       so the VOP number is tested inside the template body -->
  <xsl:template name="tpl_VOP_NtoM" match="m:VOP">
    <xsl:variable name="n" select="count(preceding-sibling::m:VOP) + 1"/>
    <xsl:if test="not($n &gt; $firstVOPNumber and $n &lt; $lastVOPNumber)">
      <!-- Outside the range: copy the VOP; inside the range: nothing ! -->
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
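For readers without an XSL processor at hand, the effect of the transformation can be imitated in Python. This is a stand-in based on the standard xml.etree.ElementTree module, not the XSL mechanism itself. The function name remove_vops and the six-VOP test document are hypothetical, and the VOPs removed are those numbered strictly between the two parameters, as in the style sheet.

```python
import xml.etree.ElementTree as ET

NS = {"m": "MPEG4"}  # namespace used by the illustrative syntactical description


def remove_vops(syn_xml: str, first: int, last: int) -> str:
    """Drop the VOP elements whose 1-based number lies strictly between first and last."""
    root = ET.fromstring(syn_xml)
    # findall returns a list, so removing children while looping is safe.
    for number, vop in enumerate(root.findall("m:VOP", NS), start=1):
        if first < number < last:
            root.remove(vop)
    return ET.tostring(root, encoding="unicode")


# Tiny test document: six VOPs whose <Type> holds their own number.
SYN = ('<Bitstream xmlns="MPEG4">'
       + "".join(f"<VOP><Type>{i}</Type></VOP>" for i in range(1, 7))
       + "</Bitstream>")

filtered = ET.fromstring(remove_vops(SYN, 2, 5))
kept = [v.findtext("m:Type", namespaces=NS) for v in filtered.findall("m:VOP", NS)]
print(kept)  # ['1', '2', '5', '6']
```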

A first embodiment of a system according to the invention is represented schematically in FIG. 2. This system comprises a server device SX, a transmission channel CX and a user device TX. In this embodiment, the user device TX sends a demand DX for a content to the server device SX via the transmission channel CX. The demand DX comprises the user specification UP. Upon reception of the demand DX, the server device SX recovers the bitstream that corresponds to the demanded content, filters the recovered bitstream according to the user specification as described above, and sends the filtered bitstream FBST to the user device TX via said transmission channel CX. Thus the filtering is done at the server end.

A second embodiment of the invention is represented schematically in FIG. 3. This system comprises a server device SX, a transmission channel CX and a user device TX. In this embodiment, the user device TX receives a bitstream BST and a semantic description SEM of the bitstream BST from the server device SX via the transmission channel CX. Locally, a user specification UP is captured and a syntactical description SYN of the bitstream BST is generated. Then the bitstream BST is filtered as described above, and the corresponding filtered bitstream FBST is generated. Thus the filtering is done at the user end.

In another embodiment (not represented here), the user device receives the syntactical description SYN of the bitstream beforehand instead of the bitstream itself. Thus it does not have to generate the syntactical description of the bitstream.

Advantageously, the above-described steps are implemented by means of sets of instructions being executable under the control of one or more computers or digital processors.

It is to be noted that, with respect to the described devices and filtering method, modifications or improvements may be proposed without departing from the scope of the invention. The invention is thus not limited to the examples provided. It is not restricted to the use of any particular format, standard or language. It is not restricted to video content.

More particularly, in the example given above, a specific type of correlation was described between the first timing data, the time position, and the second timing data. This is not restrictive. The first timing data vary with the encoding format of the bitstream. The second timing data described above are one of the formats proposed in the MPEG-7 standard. However, other formats are available in the same standard, and other standards or types of descriptions may be used. The only necessary condition is that a time position may be derived from both the first timing data and the second timing data.

Use of the verb to “comprise” and its conjugations does not exclude the presence of elements or steps other than those stated in the claims.

1. A method of filtering a bitstream having elementary units having a time position, and first timing data indicative of said time positions, a syntactical description of said bitstream, said syntactical description having elements describing said elementary units and containing said first timing data, a semantic description of said bitstream, said semantic description comprising second timing data and characterizing data relating to one or more elementary units, said second timing data being indicative of the time positions of said elementary units, at least a user specification, said method comprising the steps of: by a filtering processor, searching in said semantic description for the characterizing data that match said user specification to identify matching elementary units, deriving time positions for said matching elementary units from said second timing data, using said first timing data to locate in said syntactical description the elements corresponding to said time positions, generating a filtered syntactical description in which the located elements are removed, generating a filtered bitstream from said filtered syntactical description.

2. A filtering method as claimed in claim 1, wherein said syntactical description is an XML document and said filtered syntactical description is generated by applying to said syntactical description a parametric transformation defined in an XSL style sheet having said time positions as input parameter.

3. A filtering method as claimed in claim 1, wherein said semantic description is compliant with the MPEG-7 standard, and said second timing data are contained in <MediaTime> elements.

4. A device for filtering a bitstream comprising: elementary units having a time position, and first timing data indicative of said time positions, a syntactical description of said bitstream, said syntactical description having elements describing said elementary units and containing said first timing data, a semantic description of said bitstream, said semantic description having second timing data and characterizing data relating to one or more elementary units, said second timing data being indicative of the time positions of said elementary units, at least a user specification, a filtering processor, said filtering processor configured for searching in said semantic description for the characterizing data that match said user specification to identify matching elementary units, deriving time positions for said matching elementary units from said second timing data, using said first timing data to locate in said syntactical description the elements corresponding to said time positions, generating a filtered syntactical description in which the located elements are removed, generating a filtered bitstream from said filtered syntactical description.

5. A transmission system comprising: a server device, a transmission channel, a user device, said user device being intended to receive, from said server device via said transmission channel, a bitstream comprising elementary units having a time position and first timing data indicative of said time positions, and a semantic description of said bitstream, said semantic description comprising second timing data and characterizing data relating to one or more elementary units, said second timing data being indicative of the time positions of said elementary units, said user device having a processor for capturing at least a user specification, generating a syntactical description of said bitstream, said syntactical description comprising elements describing said elementary units and containing said first timing data, searching in said semantic description for the characterizing data that match said user specification to identify matching elementary units, deriving time positions for said matching elementary units from said second timing data, using said first timing data to locate in said syntactical description the elements corresponding to said time positions, generating a filtered syntactical description in which the located elements are removed, generating a filtered bitstream from said filtered syntactical description.

6. A transmission system comprising: a server device, a transmission channel, a user device, said user device having means for sending a demand for a content to said server device via said transmission channel, said demand including a user specification, and said server device having a processor for filtering a bitstream corresponding to the demanded content according to said user specification and for sending the filtered bitstream to said user device via said transmission channel, wherein said bitstream includes elementary units having a time position and first timing data indicative of said time positions, is semantically described in a semantic description comprising second timing data and characterizing data relating to one or more elementary units, said second timing data being indicative of the time positions of said elementary units, is syntactically described in a syntactical description comprising elements describing said elementary units and containing said first timing data, and, said processor for filtering the bitstream that corresponds to the demanded content is configured for searching in said semantic description for the characterizing data that match said user specification to identify matching elementary units, deriving time positions for said matching elementary units from said second timing data, using said first timing data to locate in said syntactical description the elements corresponding to said time positions, generating a filtered syntactical description in which the located elements are removed, generating a filtered bitstream from said filtered syntactical description.